The first report on a new Tor species, Tor barakae Arunkumar & Basudha 2003, from Bangladesh using DNA barcoding technique

Mahseer are large-scaled fish of the family Cyprinidae that inhabit mountainous streams, rivers, and reservoirs in South and Southeast Asia. Two species of the genus Tor, Tor tor and Tor putitora, have previously been reported from Bangladesh. This study aimed to identify, to species level, specimens collected from the Sangu River. The collected samples were identified using the DNA barcoding technique, by amplifying 645 bp of the cytochrome c oxidase subunit 1 (COI) gene with the FishF1/FishR1 universal primers. Sequence similarity searches against the BOLD and NCBI databases showed 99.85–100 % similarity to the reference genome. The genetic divergence between T. putitora and SR1, BT, and ST was 0.0239, 0.0239, and 0.0238, respectively, and that between T. tor and SR1, BT, and ST was 0.0272, 0.0272, and 0.0270, respectively. In the phylogenetic tree, two clusters were formed, in which the collected specimens (SR1, BT, and ST) formed a subcluster with the reference genome (NC_056296.1 T. barakae) with 100 % bootstrap support. These findings reveal the presence of a Tor species new to Bangladesh, Tor barakae, in the Sangu River basin.


Introduction
Freshwaters in South and Southeast Asia are a haven for mahseer fishes of the family Cyprinidae. These large-scaled fish are regarded as economically valuable for aquaculture, capture fisheries, and as sport fish. They are mainly found in mountain streams with rocky bottoms but also occur in rivers and lakes [1,2]. There are a total of 59 mahseer species, of which 36 belong to the genus Tor, 22 to Neolissochilus, and one to Naziritor [3]. Only species of the genus Tor are referred to as the 'true mahseers' [4,5]. Similarly, 14 additional species of Tor were recognized in the trans-Himalayan region and Southeast Asia [4,6,2]. The maximum species diversity for the genus Tor in southern Asia was found in the Indian subcontinent, with seven species spread across it [1]. Two morphologically studied species of the genus, Tor tor and Tor putitora, are reportedly found in the mountainous streams of Sylhet, Netrokona, Mymensingh, and Bandarban, and in the Kaptai reservoir in Bangladesh [7]. The identity of mahseer has remained taxonomically ambiguous, particularly when based on morphological characteristics [4]. Taxonomists dispute the recognized morphological criteria for classifying mahseer species, such as the relationship between head length and body depth [8] or the anatomy of the lip and median lobes [9]. The integration of morphological and molecular techniques is crucial for accurately determining the taxonomic classification and conservation status of Tor species. DNA barcoding is a modern molecular technique capable of successfully identifying fish species while revealing intra- and interspecific variation [10]. If properly applied, it can reduce fish misidentification and resolve cryptic species, both of which impede research, fishery management, and conservation efforts [11,12].
During the survey of hill stream fish, mahseer were found in the Sangu River, distributed in the Andarmanik, Boro Modok, and Ligri areas of the river in the Thanchi upazila of Bandarban. The water in these regions is deep with sandy or gravelly bottoms, which provide a favorable environment for mahseer. Locally, this mahseer is known as Mikimau or Phorong, and it is morphologically different from the two species already recorded (T. tor and T. putitora). Lack of knowledge about its biology, population, and distribution keeps it out of the realm of conservation management. This study aimed to identify mahseer fishes at the species level using morphological and molecular approaches.

Sample collection and preservation
Three morphologically distinct fish samples were collected from the Sangu River in the Thanchi upazila of Bandarban (Fig. 1). The samples were labeled as Sangu River 1 (SR1), Bandarban Tor (BT), and Sangu Tor (ST). A clear photograph was taken and is presented in Fig. 2. For the molecular part of the study, each sample was thoroughly cleansed with distilled water to eliminate any potential impurities, after which 100 mg of caudal fin was excised using sterile scissors, appropriately labeled, stored in 95 % ethanol, and kept at −20 °C until further use. The ethical standards outlined for animal handling were strictly followed throughout the study while handling fish specimens.

Morphological approach
Morphometric and meristic characters of each fish were recorded following the method reported by Gharaei [13] with slight modifications. The abbreviations used for the morphological characteristics were total length (TL), standard length (SL), head length (HL), body depth (BD), dorsal fin height (DFH), pectoral fin height (PFH), pelvic fin height (PeFH), and anal fin height (AFH).
The thermal cycling conditions used for each reaction in a 96-well thermal cycler (ASTEC Thermal Cycler GeneAtlas, Japan) were: initial denaturation at 95 °C for 2 min; 35 cycles of denaturation at 94 °C for 35 s, primer annealing at 54 °C for 30 s, and primer extension at 72 °C for 1 min; and a final extension at 72 °C for 10 min. The COI-amplified products were separated on a 1.2 % agarose gel. A commercial purification kit (PureLink™ PCR purification, Thermo Fisher Scientific, USA) was used to purify the PCR products, which were later sequenced by Apical Scientific Sdn Bhd, Malaysia.
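As a quick check on the cycling program described above, the profile can be written down as data and its programmed hold time summed. This is a minimal sketch, not part of the original protocol: the variable and function names are illustrative, the 94 °C step is taken to be the per-cycle denaturation, and ramping time between temperatures is ignored.

```python
# Thermal profile as reported in the text: (step name, temperature in deg C, duration in s).
INITIAL_DENATURATION = ("initial denaturation", 95, 120)
CYCLE_STEPS = [
    ("denaturation", 94, 35),  # assumed role of the 94 deg C step
    ("annealing", 54, 30),
    ("extension", 72, 60),
]
FINAL_EXTENSION = ("final extension", 72, 600)
N_CYCLES = 35


def total_hold_time_s(initial, cycle_steps, final, n_cycles):
    """Sum the programmed hold times; ramping between temperatures is ignored."""
    per_cycle = sum(duration for _, _, duration in cycle_steps)
    return initial[2] + n_cycles * per_cycle + final[2]


seconds = total_hold_time_s(INITIAL_DENATURATION, CYCLE_STEPS, FINAL_EXTENSION, N_CYCLES)
print(f"{seconds} s = {seconds / 60:.1f} min of programmed hold time")
```

The actual run takes longer than this figure because the instrument needs time to ramp between temperatures.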

Sequenced data analysis
After receiving the sequencing data, the raw data were manually checked and edited using the BioEdit program [15]. The mtDNA COI gene nucleotide sequences were analyzed with the basic local alignment search tool (BLAST), an online tool from the National Center for Biotechnology Information (NCBI), to assess their homology with GenBank data. The nucleotide sequences were also compared to the Barcode of Life Data (BOLD) database for species determination.
The ClustalW program included in MEGA-X was used to align the nucleotide sequences with each other [16], with a gap opening penalty of 15 and a gap extension penalty of 6.6. Using the maximum composite likelihood approach [17] and the Kimura 2-parameter (K2P) model [18], the pairwise genetic distance between sample sequences was estimated in MEGA-X [18]. A phylogenetic tree was constructed using the maximum likelihood method [16,17] and the K2P model [18] to estimate the taxonomic relationship, with 1000 bootstrap replicates [19]. The phylogenetic tree was modified in iTOL v4 [20], followed by QR code generation using a QR code generator (https://goqr.me/).
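The similarity values returned by such database searches are, in essence, the percentage of aligned positions at which the query and the reference carry the same base. The sketch below illustrates that idea on a pair of already aligned sequences. It is a simplified stand-in, not the scoring used by BLAST or BOLD (here gapped positions are simply skipped), and the two toy sequences are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percent of compared positions with identical bases.

    Both sequences must already be aligned; positions with a gap ('-')
    in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Hypothetical 10-bp toy alignment with a single mismatch at the last site.
query = "ACGTACGTAC"
reference = "ACGTACGTAT"
print(f"{percent_identity(query, reference):.2f} % identity")
```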
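For readers unfamiliar with the K2P model, the distance between two aligned sequences is d = −½ ln[(1 − 2P − Q)√(1 − 2Q)], where P and Q are the proportions of sites that differ by a transition and by a transversion, respectively. The following is a minimal hand-written sketch of that formula, not the MEGA-X implementation. The function name and toy sequences are hypothetical, and sites with gaps or ambiguous bases are simply skipped.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Hypothetical toy sequences differing by one transition (A<->G) in 10 sites.
print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 4))
```

Because the correction accounts for multiple substitutions at the same site, the K2P distance is always at least as large as the raw proportion of differing sites.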

Morphological identification of the collected sample
The collected specimens were darkish blue on the back with a reddish ventral side, and the anal fin reached the caudal fin base (Fig. 2). Morphometric and meristic characters are listed in Tables 1 and 2. The HL:BD ratios of the collected samples (SR1, ST, and BT) were 0.73:1, 0.66:1, and 0.69:1, respectively.

Molecular identification of the collected samples
The FishF1/FishR1 universal primer pair was used to amplify the mtDNA COI gene in the three collected samples. After the PCR products were electrophoresed, the gel documentation system showed clear and distinct DNA bands, confirming that the PCR conditions were appropriate. The nucleotide sequence length of each of the three specimens was 645 bp, with no insertions, deletions, or stop codons in any of the sequences. The sequence similarity data obtained from the BOLD and NCBI databases are given in Table 3. After the nucleotide sequences were submitted to NCBI, the accession numbers provided for SR1, BT, and ST were OQ694386, OQ694387, and OQ694388.
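The absence of stop codons is a routine check that the amplified fragment is a functional mitochondrial COI sequence rather than a nuclear pseudogene. A minimal sketch of such a check is given below. It assumes the vertebrate mitochondrial genetic code (NCBI translation table 2), in which TAA, TAG, AGA, and AGG are stop codons; the function names and the short toy sequence are hypothetical, and this is not the procedure used in the study.

```python
# Stop codons in the vertebrate mitochondrial genetic code (NCBI translation table 2).
MITO_STOP_CODONS = {"TAA", "TAG", "AGA", "AGG"}


def stop_codon_positions(seq, frame):
    """0-based start positions of in-frame stop codons for a forward frame (0, 1 or 2)."""
    seq = seq.upper()
    return [
        i
        for i in range(frame, len(seq) - 2, 3)
        if seq[i:i + 3] in MITO_STOP_CODONS
    ]


def open_frames(seq):
    """Forward reading frames that contain no stop codon."""
    return [frame for frame in range(3) if not stop_codon_positions(seq, frame)]


# Hypothetical 12-bp toy sequence, not a real COI fragment.
toy = "ATGGCCTAAGGA"
for frame in range(3):
    print(frame, stop_codon_positions(toy, frame))
print("open frames:", open_frames(toy))
```

A protein-coding barcode is expected to have at least one forward frame free of stop codons across its full length.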

Genetic diversity and divergence analysis
Pairwise genetic distance was analyzed using the Kimura 2-parameter (K2P) model in the MEGA-X program, and the distances were compared with the reference genome of a voucher specimen obtained from NCBI. The results indicated no genetic diversity among the collected samples (SR1, BT, and ST). However, a clear divergence was observed when the samples were compared to the other seven species within the genus Tor. The lowest genetic divergence was 0.0144, 0.0144, and 0.0143 for T. mussullah vs. SR1, BT, and ST, respectively. The highest genetic divergence was 0.0272, 0.0272, and 0.0270 for T. tor vs. SR1, BT, and ST, respectively (Fig. 3).

Phylogenetic tree
The nucleotide sequences of the collected samples were used to reconstruct a maximum likelihood phylogenetic tree. The Kimura 2-parameter (K2P) model with 1000 bootstrap replicates was used to measure evolutionary genetic distance. In the phylogenetic tree, two clusters were formed, in which SR1, BT, and ST formed a sub-cluster with the reference genome (NC_056296.1 T. barakae) with 100 % bootstrap support (Fig. 4). Lepidocephalichthys annandalei was used as the outgroup.
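Bootstrap support is obtained by repeatedly resampling the columns of the alignment with replacement, rebuilding the tree from each pseudo-alignment, and counting how often a given cluster reappears. The sketch below shows only the resampling step for one replicate. The tree-building step is omitted, and the function name and the toy alignment are hypothetical.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement, keeping the number of columns fixed.

    `alignment` maps a sequence name to its aligned sequence string.
    """
    names = list(alignment)
    n_sites = len(alignment[names[0]])
    if any(len(alignment[name]) != n_sites for name in names):
        raise ValueError("all sequences must have the same aligned length")
    # The same column indices are applied to every sequence so that
    # each resampled site stays a true column of the original alignment.
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(alignment[name][c] for c in columns) for name in names}


# Hypothetical toy alignment; real input would be the 645-bp COI alignment.
toy_alignment = {
    "sample_1": "ACGTACGTAC",
    "sample_2": "ACGTACGTAT",
    "reference": "ACGAACGTAT",
}
replicate = bootstrap_alignment(toy_alignment, random.Random(0))
for name, seq in replicate.items():
    print(name, seq)
```

With 1000 replicates, a support value of 100 % means the cluster was recovered in every one of the 1000 resampled trees.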

QR code for the identified species
A QR code was generated using a QR code generator (Fig. 5) to represent the identified fish species. Scanning the QR code provides access to the nucleotide sequences and comprehensive data of the identified fish species.
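One possible way to produce such a code programmatically is to request a QR image that encodes a link to the sequence record. The sketch below only builds the request URL. It rests on several assumptions that are not stated in the study: that the QR payload is the GenBank record page of an accession number, that the image is requested from the public QR code API associated with goqr.me, and that this API takes `data` and `size` parameters. These details should be checked against the service's documentation before use.

```python
from urllib.parse import urlencode

# Assumed endpoint of the QR code API associated with goqr.me.
QR_ENDPOINT = "https://api.qrserver.com/v1/create-qr-code/"
# Assumed payload: the GenBank nucleotide record page for an accession number.
NUCCORE_BASE = "https://www.ncbi.nlm.nih.gov/nuccore/"


def qr_image_url(accession, size_px=300):
    """Build a URL that requests a QR image encoding the record link."""
    record_url = NUCCORE_BASE + accession
    query = urlencode({"data": record_url, "size": f"{size_px}x{size_px}"})
    return f"{QR_ENDPOINT}?{query}"


# OQ694386 is one of the accession numbers reported in the text.
print(qr_image_url("OQ694386"))
```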

Discussion
Mahseer species show only slight morphological differences, which makes them difficult to distinguish and complicates conservation strategies [2]. Published accounts are inconsistent or fail to account for the variety of species found among Indian mahseers [21]. In T. barakae, HL is considerably shorter than BD [2,22]. The HL:BD ratios of SR1, BT, and ST were 0.73:1, 0.66:1, and 0.69:1, respectively, indicating a shorter HL relative to BD than in T. putitora, which has a ratio of 0.85:1 [23].
DNA-based identification approaches could be a crucial analytical substitute or complement to resolve this confusion, and DNA barcoding is a novel technique for swiftly, precisely, and automatically identifying various species. The COI barcoding region has been selected as the primary molecular diagnostic marker for identifying animal species [24]. Data from mtCOI and mtCytb analysis using various species delimitation techniques also demonstrated the validity of T. barakae, which was evidently distinct from its congeners [2]. The average homology of the mtCOI sequences was 99.95 % in the BLAST analysis and 100 % in the BOLD analysis (Table 3), which was reasonably satisfactory.
Low inter-species divergence (K2P) in the genus Tor has been reported earlier [25]. The genetic divergence between species of the genus Tor has been found to range from 0.000 to 0.037 [6]. Similarly, the genetic divergence between T. barakae and T. tor was reported to be 0.0263 [2]. In this study, the genetic divergence between T. putitora and SR1, BT, and ST was 0.0239, 0.0239, and 0.0238, respectively. The genetic divergence between T. tor and SR1, BT, and ST was 0.0272, 0.0272, and 0.0270, respectively, which agrees well with previously published reports [6,2,25]. The absence of genetic diversity among SR1, BT, and ST indicated that they are genetically uniform. Two clusters were formed in the maximum likelihood phylogenetic tree reconstructed from the COI sample sequences (Fig. 4). SR1, BT, and ST clustered with the reference genome (NC_056296.1 T. barakae). Since no specimen clustered with the outgroup, this indicated that SR1, BT, and ST were T. barakae specimens. Individual samples are placed on phylogenetic branches according to their taxonomic affinity, so that members of the same species cluster closely together [26].
The generated QR code provides the nucleotide sequence and comprehensive information about the identified fish, and it can be easily read using a smartphone with a QR scanner without additional hardware. T. barakae remains poorly known with respect to its population, biology, and distribution at the micro level. The species is listed as near threatened (NT) on the IUCN Red List due to its limited distribution and threats to its habitat in India [27]. Its population is also declining in Bangladesh due to overfishing, habitat destruction, and various other anthropogenic factors.
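The comparison above is simple arithmetic on the reported ratios, which can be made explicit as follows. This is only an illustrative sketch: the three specimen ratios and the T. putitora value are those quoted in the text, the variable names are hypothetical, and no statistical test is implied.

```python
# HL:BD ratios quoted in the text, expressed as HL / BD.
SPECIMEN_RATIOS = [0.73, 0.66, 0.69]
T_PUTITORA_RATIO = 0.85  # value cited in the text from [23]

mean_ratio = sum(SPECIMEN_RATIOS) / len(SPECIMEN_RATIOS)
# How much smaller the mean specimen ratio is, relative to the T. putitora ratio.
relative_difference = (T_PUTITORA_RATIO - mean_ratio) / T_PUTITORA_RATIO

print(f"mean HL:BD of specimens = {mean_ratio:.2f}:1")
print(f"all specimens below T. putitora: {all(r < T_PUTITORA_RATIO for r in SPECIMEN_RATIOS)}")
print(f"relative difference = {relative_difference:.1%}")
```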

Conclusion
This study revealed that a Tor species new to Bangladesh, T. barakae, occurs within the Sangu River basin. Further research is necessary to implement complete conservation strategies, such as habitat management, protection, and artificial reproduction.

Fig. 2. Morphometric variables that differentiate the specimen from other Tor species: nm, head length; ab, body depth; k, tip of the caudal fin. Scale: 5 cm.

Fig. 3. Pairwise genetic distance of collected specimens (SR1, BT, and ST) with the reference genome.

Fig. 4. The maximum likelihood phylogenetic tree reconstructed from the collected specimens (SR1, BT, and ST) with the reference genome.

Md. Amdadul Haque, MS: Conceived and designed the experiments; Performed the experiments; Analyzed and interpreted the data; Wrote the paper. Jonaira Rashid, PhD: Conceived and designed the experiments; Performed the experiments; Analyzed and interpreted the data. Md. Lipon Mia, MS; Md. Azhar Ali, PhD: Performed the experiments. Md. Khaled Rahman, MS: the experiments; Analyzed and interpreted the data. Anuradha Bhadra, PhD: Contributed reagents, materials, analysis tools or data. Yahia Mahmud, PhD: Conceived and designed the experiments; Contributed reagents, materials, analysis tools or data.

Table 1
Morphometric characters of the collected specimens.
M.A. Haque et al.

Table 2
Meristic characters of the collected specimens.

Table 3
Nucleotide BLAST analysis with the reference genome to identify the samples.